Normalizing Early English Letters for Neologism Retrieval

Authors

  • Mika Hämäläinen
  • Tanja Säily
  • Eetu Mäkelä

Abstract

Our project studies social aspects of innovative vocabulary use in early English letters. In this abstract we describe the current state of our method for detecting neologisms. The main obstacle at present is that our corpus consists of non-normalized text, so spelling normalization is the first problem we need to solve before we can apply automatic methods to the whole corpus.
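
The dependency described above (normalize spelling first, then look for neologisms) can be illustrated with a minimal sketch. This is not the authors' method, which the abstract does not specify. It assumes a tiny hypothetical modern lexicon (`MODERN_LEXICON`), a string-similarity lookup from Python's standard `difflib` module, and an arbitrary similarity cutoff of 0.75. The function names `normalize_token` and `neologism_candidates` are invented for illustration.

```python
from difflib import get_close_matches

# Hypothetical modern lexicon (assumption); a real system would load a full word list.
MODERN_LEXICON = {"i", "have", "received", "your", "letter", "the", "good"}


def normalize_token(token, lexicon=MODERN_LEXICON, cutoff=0.75):
    """Return the closest modern spelling, or the lowercased token if none is close enough."""
    lowered = token.lower()
    if lowered in lexicon:
        return lowered
    # get_close_matches ranks lexicon entries by SequenceMatcher similarity.
    matches = get_close_matches(lowered, sorted(lexicon), n=1, cutoff=cutoff)
    return matches[0] if matches else lowered


def neologism_candidates(tokens, lexicon=MODERN_LEXICON):
    """Return tokens whose normalized form is still outside the lexicon."""
    return [t for t in tokens if normalize_token(t, lexicon) not in lexicon]


# Invented example tokens with historical-looking spellings.
tokens = ["I", "haue", "receiued", "yor", "lettre", "scribblement"]
normalized = [normalize_token(t) for t in tokens]
candidates = neologism_candidates(tokens)
```

In this sketch, only tokens that remain outside the lexicon after normalization are flagged as candidates. One known weakness of such an approach is that a genuine neologism that happens to resemble an existing word would be normalized away instead of flagged.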

Similar resources

Theories of Opiate Addiction in the Early Works of Burroughs and Trocchi

In his article "Theories of Opiate Addiction in the Early Works of Burroughs and Trocchi" Richard English discusses William S. Burroughs's and Alexander Trocchi's representations of opiate addiction with special reference to their early writings. English examines the concept of homo heroin that can be attributed to Burroughs and lists and expounds the qualities he adduces. Among these are: immo...

Investigating the Relationship Between Iranian EFL Learners’ Use of Strategies in Collocating Words and Their Proficiency Level

This study investigated the relationship between Iranian EFL learners’ use of strategies in producing English collocations and their proficiency level. Participants were 115 undergraduate university students at 3 proficiency levels, that is, low, intermediate, and high, majoring in English language at the Faculty of Letters and Humanities at Shahid Chamran University of Ahvaz, Iran. Their select...

Neoveille, a Web Platform for Neologism Tracking

This paper details software designed to track neologisms in seven languages through newspaper monitor corpora. The platform combines state-of-the-art processes to track linguistic changes and a web platform for linguists to create and manage their corpora, accept or reject automatically identified neologisms, describe linguistically the accepted neologisms and follow their lifecycle on the m...

SlangNet: A WordNet like resource for English Slang

We present a WordNet-like structured resource for slang words and neologisms on the internet. The dynamism of language is often an indication that current language technology tools trained on today’s data may not be able to process the language in the future. Our resource could be (1) used to augment the WordNet, (2) used in several Natural Language Processing (NLP) applications which make use...

Multilingual Experiments of UTA at CLEF 2003: The Impact of Different Merging Strategies and Word Normalizing Tools

There are two main translation approaches in a multilingual information retrieval task: either to translate the topics or to translate the datasets. The first one is an easier and more common approach. There are two indexing approaches: either to index the languages in the same index, or to build separate indexes for different languages. If the latter approach is used, retrieved result sets mus...

Publication date: 2018